1. INTRODUCTION



During the period 14 February 1977-16 February 1977, 
meetings were held at the Naval Postgraduate School (NPS) to 
discuss the TPQ-27 PSVT. Participating in these meetings were 
Major Earl Peete (MAD, Pt. Mugu), Major Dave Allen (MCTSSA, 
Camp Pendleton), Capt. Jerry Paccassi and myself (both at NPS). 
Also in attendance were Mike Patrow and Mike Lowe, students at 
NPS. A test concept was developed which called for bomb drops 
with 18 cells in a "baseline" group, together with additional 
"demonstration" drops, conducted under eight additional 
combinations of conditions. These combinations are shown in 
Figure 1. Within each cell of the design for baseline drops, a 
test is to be made of whether contract specified CEP's have 
been met. 

In what follows, we discuss the design, certain aspects 
of performing the trials in the field, and an outline of the 
analysis procedure proposed for testing CEP's and making other 
inferences from the test data. Some of these comments came out 
of discussions at the NPS meeting, and others are suggestions 
and observations by the author. 

2. THE STATISTICAL DESIGN 

It is desirable to test the TPQ-27 over a wide range 
of levels of the variables involved, in order to facilitate 
inference about performance characteristics of this system 
and its sensitivity and response to variations in drop conditions. 

[FIGURE 1. Combination of conditions under which drops are 
planned in the PSVT. Panel label: Demonstration drops.] 

However, this testing, involving dropping bombs on an instrumented 
range, is expensive. Thus there is also a conflicting 
desire to hold the sample size as low as possible, consistent 
with achieving reasonable confidence in the tests and in the 
inferences to be made. For this reason, a sample of baseline 
conditions was established, in which most of the drops are to 
be made. The baseline cases were selected so as to cover a 
fairly large portion of operationally realistic conditions. 

The data resulting from these baseline drops will allow testing 
against contract specified CEP's in each cell, as well as sub- 
sequent analyses such as testing whether there are significant 
differences due to the factors range, altitude, range x 
altitude interaction, speed, mode, speed x mode interaction 
and speed x altitude interaction. In addition, estimates of 
the type and amount of response to changes in the main effects 
(for Auto mode) can be made. For the demonstration cells, 
tests against contract specified CEP's can also be made. 

The nature of the tests of CEP has not been completely 
determined at this time, but appears to have been narrowed 
down to several candidates. Sequential testing within each 
cell of the design appears attractive because of the expected 
savings in numbers of bomb drops. In Section 4 below we outline 
two possible sequential procedures (called "sequential 
Rayleigh" and "sequential nonparametric") as well as two fixed 
sample size procedures (called "fixed Rayleigh" and "fixed 
nonparametric"). Sample size characteristics of the sequential 
and fixed-sample size tests are shown in Tables 1 and 2. 

TABLE 2. Sequential and Fixed Rayleigh Characteristics for Several $\alpha$, $\beta$, $CEP_1/CEP_0$ Combinations. 



The column heads in Tables 1 and 2 are as follows: 

$\alpha$ : probability the test rejects $H_0$: CEP = $C_0$ in favor 
    of $H_a$: CEP = $C_1$, when in fact the system has 
    CEP = $C_0$. 

$\beta$ : probability the test accepts $H_0$ when in fact the 
    system has CEP = $C_1$. 

$CEP_1/CEP_0$ : ratio of minimum unacceptable CEP to contract 
    specified CEP. 

Min accept : the smallest possible sample size at termination 
    with acceptance of $H_0$ (i.e., the sample size required to 
    accept even a perfect system). 

Min reject : the smallest sample size possible for rejecting 
    $H_0$ (for the nonparametric sequential procedure only; for 
    the sequential Rayleigh procedure, the min reject number is 
    1 for all cases). NOTE: for the nonparametric case, round up 
    to integer values where necessary. 

Low reject : the sample size required for rejection if all 
    radial misses fell at distance $CEP_1$ from the target (for 
    sequential Rayleigh only). 

Max E(N) : the worst case expected sample size for the 
    sequential procedures (this occurs for some true system CEP 
    between $CEP_0$ and $CEP_1$). 

Typical N : average of the sequential tests' expected sample 
    sizes under $H_0$ and under $H_a$. 

N fixed : sample size required by the fixed-sample size 
    procedures. 

Slope : slope of the lines forming the boundaries of the 
    continuation region for sequential procedures. 

Accept intercept : the y-intercept of the boundary line 
    defining the accept region for sequential procedures. 

Reject intercept : the y-intercept of the boundary line 
    defining the reject region for sequential procedures. 
    NOTE: for the sequential nonparametric procedures, the 
    y-intercepts are symmetric if $\alpha = \beta$; otherwise the 
    x-intercept of the rejection line is given under "Min reject". 

Max $3\sigma_N$ : three times the max E(N). This is roughly two 
    standard deviations above the expected sample size; 
    virtually none of the tests should continue beyond this 
    value. 

The values shown in Tables 1 and 2 pertaining to 
sequential tests were obtained using Wald's approximations, 
and are therefore slightly conservative. Exact stopping bounds 
are available for these tests (for example those prepared by 
Leo A. Aroian at TRW Systems, Redondo Beach, California), and 
they should be used if the sequential approach to testing CEP 
is adopted. Truncation of the sequential test was considered, 
but it appears undesirable for several reasons: 1) truncation 
increases average sample sizes, 2) truncation complicates the 
computation of acceptance and rejection bounds (although, 
again, tables may be available covering most of our cases), 
and 3) the terminal decision for cases reaching the truncation 
point is somewhat arbitrary. In addition, for the $\alpha$, $\beta$, 
$CEP_1/CEP_0$ combinations we can realistically anticipate 
(see Tables 1 and 2 with $\alpha$ and $\beta$ on the order of 0.10 and 
$CEP_1/CEP_0$ about 2, for example), max $3\sigma_N$ (which is 
essentially an upper bound on sample size N) is not unacceptably 
large, in view of the fact that over the many cells of the design, 
with an individual sequential test being performed in each 
cell, the overall average sample size per cell will almost 
certainly fall below max E(N). Consequently, it is felt that 
truncation would only cause an unnecessary increase in overall 
drop requirements for the entire test sequence. 

An alternative to untruncated sequential testing is 
to use fixed sample size tests. This has the effect of 
balancing the number of drops in the various cells of the 
design matrix, which is desirable for the subsequent analyses 
concerning differences due to the various factors. However, 
as with truncation of the sequential procedures, the overall 
sample size requirements are larger for fixed sample size tests. 
It is our feeling that the balance in design achieved by fixed 
sample size testing is far outweighed by its disadvantage with 
respect to overall sample size requirement. As is discussed 
in the succeeding section, the way in which the field tests 
may be carried out will tend to balance the design even with 
sequential testing in each cell, and this further points to 
superiority of using sequential testing. 

3. OUTLINE OF FIELD TEST PROCEDURES 

In order to avoid losing efficiency in the PSVT, it 
is desirable that drops be conducted in such a way as to avoid 
(as much as possible) confounding factors suspected to affect 
system performances, and to provide "insurance" against bias 
in results due to unknown causes. Ideally this would be in 
part accomplished by scheduling individual drops over the 
various cells of the design using a formal randomization pro- 
cedure. This might mean, for example, that a single flight 
(operation) would call for first dropping a bomb at 300 kts, 
20k ft altitude at 20 mi range in Auto mode, next dropping a 
bomb at 500 kts, 10k ft altitude at 55 mi range in Auto mode, 
and so on for the remaining bombs to be dropped in this 
operation. Clearly such a schedule may not be practical, so 
constraints must be imposed on the scheduling process. The 
author is not in a position to assess what constraints are 
necessary, but he wishes to point out the desirability of 
imposing as little constraint as possible. 

In order to gain appreciation of the possible effects 
of confounding mentioned above, consider an example test 
schedule in which the first group of operations are all conducted 
at 500 kts, 10k altitude, 20 mi range, Auto mode. These drops 
might be followed by operations all at 500 kts, 20k, 20 mi, Auto, 
etc. Suppose, moreover, each individual operation (consisting 
of eight bombs) is constrained such that all eight bombs are 
dropped under the same conditions (in the same cell of the 
design). Then factors having to do with each individual 
operation (such as radar alignment, pilot effect, wind profile 
errors, etc.), whose effects for the given operation may be 
unknown or only partially known (even using ARIS), cannot be 
"balanced out"; rather they may cause bias of an amount 
undeterminable by the experimenter and analyst. Similarly, 
conducting operations all with fixed combinations of speed, 
range, etc. close together in time would preclude balancing 
out unknown long term trend effects (if any). 

There is another reason why allowing drops in different 
cells in a single operation would be desirable. If a sequential 
test plan is adopted for CEP testing, forcing observations to 
be made in batches of eight (say) in a given cell of the 
design rather than one at a time (i.e., no closer together 
in time than the miss distance determination turn-around time) 
will generally lead to larger than necessary sample sizes-- 
perhaps substantially larger. As a rough assessment of the 
effect of such "batch" testing relative to ordinary sequential 
testing, consider the nonparametric sequential test with 
$\alpha = \beta = .1$ and $CEP_1/CEP_0 = 2$. Then the "typical" expected 
sample size is about 6.3. Imagine for the moment sample size 
N is roughly exponentially distributed (which is certainly 
an oversimplification but is consistent with the observation 
that in many cases the mean and standard deviation of N are 
about the same, and is adequate for the present discussion) . 

With batches of size eight, one batch would be required with 
a probability on the order of .7, two batches with probability 
about .2 and three batches with probability roughly .1. Thus 
the expected number of batches required would be about 1.4, 
or roughly 11 drops per cell on the average. Thus the effect 
of batch arrivals of observations in each cell is an increase 
in total drops for the experiment, perhaps by as much as 75%. 
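
The rough batch calculation above can be reproduced with a few lines of Python. This is only a sketch under the same simplifying assumption made in the text (sample size N treated as exponentially distributed with mean 6.3); the function names are illustrative and not part of the test plan. Note that the "three batches" figure quoted in the text corresponds to three or more batches, while the function below returns the probability of exactly b batches.

```python
import math

def batch_distribution(mean_n, batch_size, max_batches=50):
    """P(exactly b batches are needed), b = 1, 2, ..., assuming the
    sequential sample size N is exponential with mean mean_n."""
    probs = []
    for b in range(1, max_batches + 1):
        still_going_before = math.exp(-(b - 1) * batch_size / mean_n)  # P(N > (b-1)*batch)
        still_going_after = math.exp(-b * batch_size / mean_n)         # P(N > b*batch)
        probs.append(still_going_before - still_going_after)
    return probs

def expected_drops(mean_n, batch_size):
    """Expected number of drops per cell when drops arrive in batches."""
    probs = batch_distribution(mean_n, batch_size)
    expected_batches = sum(b * p for b, p in enumerate(probs, start=1))
    return expected_batches * batch_size

probs = batch_distribution(6.3, 8)
drops = expected_drops(6.3, 8)
print("P(1 batch), P(2 batches):", round(probs[0], 2), round(probs[1], 2))
print("P(3 or more batches):", round(1.0 - probs[0] - probs[1], 2))
print("expected drops per cell:", round(drops, 1))
print("percent increase over 6.3:", round(100.0 * (drops / 6.3 - 1.0)))
```

Changing the batch size argument shows how quickly the penalty shrinks as batches get smaller, which is the practical point of the paragraph above.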

In summary, the implication of the foregoing discussion 
is that it may well be worth expending test resources to allow 
individual drops in more than one cell within a given operation. 
In addition, variables such as aircraft heading, time of day, 
order within the overall test sequence, weather, etc. should 
"be varied" as much as practicable within a given cell of the 
design (that is, have as many variations and combinations of 
levels as practical associated with the drops in each given 
cell) . This may be viewed as "buying insurance" against 
unforeseen effects of unknown causes in the experiment; in 
addition, such an approach may allow deduction of probable 
causes of system misbehavior in some cases of importance, should 
such difficulty be experienced in the PSVT. 

Two final comments on the field conduct are in order. First, 
there should be no possibility of specialized "tweaking" of 
the system (by either test personnel or the contractor) to 
alter its performance in any way for the tests. This may 
involve careful monitoring of any software changes, for 
example. Second, if the sequential approach to CEP testing is 
to be adopted, there should be a mechanism for assessing each 
drop miss distance (or hit-miss outcome) in a period of time 
which is short relative to the following time interval 
standards. If individual drops are continued within a cell 
within a given operation only until sequential termination, 
the standard is the operation duration (hours?). If batch 
testing is used within each cell of the design, the standard 
would be time between operations (days?). If the individual 
drops within each operation are allocated to various cells of 
the design (which I recommend if at all possible), the 
standard is the time spent at a given range (weeks?). Thus in 
the latter case there is perhaps not a measurement 
"turn-around time" problem at all, an additional bonus in 
taking this approach. 

4. STATISTICAL ANALYSIS PLAN 

There are two levels of analysis in the PSVT plan. 

The primary goal is to test whether system performance in each 
cell of the design is within design specifications. The secondary 
analyses concern determining which factors have significant 
effect, and what the effects are. 

For the primary tests of CEP, there appear to be 
several alternatives: sequential parametric test (SPT), 
sequential nonparametric test (SNT), fixed-sample size 
parametric test (FPT) and fixed-sample size nonparametric test 
(FNT). The SPT and SNT are discussed in an earlier report [1] 
and we thus give only a very brief comment on them here. The 
FPT and FNT are discussed below. All of the tests involve 
testing whether the system displays accuracy (in each given 
cell of the design) to within the contract specified CEP, say 
$CEP_0$, or whether it has performance worse than some 
minimally acceptable performance ($CEP_1$). Thus the tests may 
be developed as tests of $H_0: M = CEP_0$ vs $H_a: M = CEP_1$, 
where M denotes the true (population) median radial miss 
distance of the system under the conditions of the given cell 
of the design. Both of the sequential test procedures are 
applications of Wald's Sequential Probability Ratio Test 
(SPRT). One, the SPT, is based on sequentially observing 
(within each cell of the design) radial miss distances, and 
assuming a Rayleigh distribution model. The SNT is based on 
observing only whether each drop falls within $CEP_0$ and 
assuming a binomial distribution model. The SPT requires the 
smallest average sample size, but its validity depends on 
whether the Rayleigh assumption is tenable (the latter 
assumption is implied by the assumption that impacts on the 
target plane follow a circular normal distribution, for 
example). The SNT requires somewhat larger samples on the 
average than does the SPT, but the binomial model involved is 
far less open to criticism on the grounds of invalidity due to 
assumption of distribution of radial miss distance. 
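
As a minimal sketch of how the SNT operates, the following Python code applies Wald's approximate SPRT bounds to hit/miss data. It assumes (this is an assumption of the sketch, following the Rayleigh relationship used in Section 5) that the hit probability under $H_a$ is $1 - (1 - p_0)^{1/k^2}$ with $k = CEP_1/CEP_0$; the function names and dictionary keys are illustrative only. The exact stopping bounds mentioned in Section 2 would replace these approximations in practice.

```python
import math

def snt_wald_plan(alpha, beta, k, p0=0.5):
    """Wald-approximation quantities for the hit/miss SPRT.

    p0 is the hit probability under H0. The hit probability under Ha
    is derived from the Rayleigh model (assumption of this sketch).
    """
    p1 = 1.0 - (1.0 - p0) ** (1.0 / k ** 2)
    reject_bound = math.log((1.0 - beta) / alpha)   # reject H0 at or above this
    accept_bound = math.log(beta / (1.0 - alpha))   # accept H0 at or below this
    z_hit = math.log(p1 / p0)                       # log-likelihood-ratio step for a hit
    z_miss = math.log((1.0 - p1) / (1.0 - p0))      # step for a miss
    # Wald's approximate expected sample sizes under H0 and under Ha
    e_n_h0 = ((1.0 - alpha) * accept_bound + alpha * reject_bound) / (
        p0 * z_hit + (1.0 - p0) * z_miss)
    e_n_ha = (beta * accept_bound + (1.0 - beta) * reject_bound) / (
        p1 * z_hit + (1.0 - p1) * z_miss)
    return {
        "p1": p1,
        "reject_bound": reject_bound,
        "accept_bound": accept_bound,
        "z_hit": z_hit,
        "z_miss": z_miss,
        "e_n_h0": e_n_h0,
        "e_n_ha": e_n_ha,
        "typical_n": 0.5 * (e_n_h0 + e_n_ha),
    }

def snt_decision(hits, plan):
    """Run the SPRT over a sequence of booleans (True = hit inside CEP0)."""
    llr = 0.0
    for n, hit in enumerate(hits, start=1):
        llr += plan["z_hit"] if hit else plan["z_miss"]
        if llr >= plan["reject_bound"]:
            return "reject H0", n
        if llr <= plan["accept_bound"]:
            return "accept H0", n
    return "continue", len(hits)

plan = snt_wald_plan(alpha=0.10, beta=0.10, k=2.0)
print(round(plan["typical_n"], 1))
print(snt_decision([True, False, False, True, True, True], plan))
```

With $\alpha = \beta = 0.10$ and a CEP ratio of 2, the "typical" expected sample size computed this way comes out close to the value of about 6.3 quoted in Section 3.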

The fixed sample size procedures are also based on 
the respective stochastic models (Rayleigh and binomial) . If 
we consider the equivalent hypotheses about median squared 
radial miss distance and measure the squared radial miss distance 
of each drop, the Rayleigh model transforms to a chi-squared 
model which is somewhat more tractable computationally. In 
what follows we describe the FPT in these terms. 

Suppose R is distributed Rayleigh, so $R^2$ is distributed 
exponential with mean $C^2/\ln 2$, where $C^2$ is squared CEP. 
The likelihood ratio test of $H_0$: median($R^2$) = $C_0^2$ vs 
$H_a$: median($R^2$) = $C_1^2$ is based on the test statistic 

    $T = \sum_{i=1}^{N} R_i^2$. 

$H_0$ is rejected whenever the calculated T is sufficiently 
large. Under $H_0$, $[(2 \ln 2)/C_0^2]\,T$ is distributed 
chi-squared with 2N degrees of freedom, 

    $\frac{2 \ln 2}{C_0^2}\, T \sim \chi^2_{(2N)}$. 

Thus the FPT procedure for each cell of the design is to observe 
N radial miss distances, $R_1, \ldots, R_N$. Calculate the 
sum of squares, $T = \sum_{i=1}^{N} R_i^2$. Reject $H_0$ if T 
exceeds 

    $\frac{C_0^2}{2 \ln 2}\, \chi^2_{(1-\alpha;\,2N)}$, 

where $\chi^2_{(1-\alpha;\,2N)}$ is the $(1-\alpha)100\%$ point in 
the $\chi^2_{(2N)}$ distribution. For example, with $C_0 = 1$ 
(i.e., $R_i$ measured in $C_0$-units), N = 9 and $\alpha$ = 0.10, 
this critical value is 18.747. 
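
The FPT critical value can be checked numerically. The sketch below uses only the Python standard library: since the degrees of freedom 2N are even, the chi-squared distribution function has a closed form (a finite Poisson-type sum), and the quantile is found by bisection. The function names are illustrative and are not part of the test plan.

```python
import math

def chi2_cdf_even_df(x, n):
    """CDF at x of chi-squared with 2n degrees of freedom (closed form)."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, n):
        term *= half / j          # (x/2)^j / j!
        total += term
    return 1.0 - math.exp(-half) * total

def chi2_quantile_even_df(prob, n):
    """Quantile of chi-squared with 2n degrees of freedom, by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even_df(hi, n) < prob:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even_df(mid, n) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def fpt_critical_value(n_drops, alpha, c0=1.0):
    """Reject H0 if the sum of squared radial misses exceeds this value."""
    return c0 ** 2 / (2.0 * math.log(2.0)) * chi2_quantile_even_df(1.0 - alpha, n_drops)

def fpt_reject(radial_misses, alpha, c0=1.0):
    """Apply the FPT to a list of radial miss distances from one cell."""
    t = sum(r * r for r in radial_misses)
    return t > fpt_critical_value(len(radial_misses), alpha, c0)

print(round(fpt_critical_value(9, 0.10), 3))
```

For N = 9 and $\alpha$ = 0.10 this reproduces the critical value quoted above to the precision given there.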

The FNT is a test of hypotheses about a binomial 
parameter p; $H_0: p \ge 1/2$ vs $H_a: p < 1/2$, where p represents 
the probability an impact falls within the contract specified 
CEP, say $CEP_0$. Assuming independence among the bomb impacts 
(see comments in Section 3 above), the number X of "hits" 
(impacts with $R_i \le CEP_0$) in N drops is binomially 
distributed; further, under the null hypothesis it is binomial 
with parameter 1/2 ($X \sim b(N, 1/2)$). The null hypothesis should 
be rejected if the observed value of X is on or below 
$b_{\alpha,N}$, where $b_{\alpha,N}$ is the largest value (obtained 
from the b(N, 1/2) tables) such that 
$P[X \le b_{\alpha,N}] \le \alpha$. For example, with 
$\alpha$ = 0.10 and N = 12, this critical value is 3. Note: due 
to the discreteness of the binomial distribution, this 
procedure is somewhat conservative, in that the actual Type I 
error probability for this example is .073, rather than the 
desired value, 0.10. If an exact test is desired, a randomized 
decision rule can be used (see E. Lehmann [3] for details). 
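
In place of the b(N, 1/2) tables, the FNT critical value and its attained Type I error probability can be computed directly. The following is a small sketch using the Python standard library; the function names are illustrative.

```python
from math import comb

def binom_cdf(x, n, p=0.5):
    """P[X <= x] for X ~ b(n, p)."""
    return sum(comb(n, j) * p ** j * (1.0 - p) ** (n - j) for j in range(x + 1))

def fnt_critical_value(n, alpha):
    """Largest b with P[X <= b] <= alpha under b(n, 1/2).

    Returns (b, attained_size); b is None if no such value exists.
    """
    best, size = None, 0.0
    for x in range(n + 1):
        cdf = binom_cdf(x, n)
        if cdf > alpha:
            break
        best, size = x, cdf
    return best, size

b, size = fnt_critical_value(12, 0.10)
print(b, round(size, 3))
```

The gap between the attained size and the nominal $\alpha$ is the conservatism referred to in the note above.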

The tests of CEP within each cell, discussed above, 
constitute the primary goal of the PSVT. Secondary goals 
include analyses of effects of various factors included in 
the design. An analysis of variance (AOV) is planned, using 
data from the baseline trials. These types of cells in the 
design received relatively greater numbers of drops, and form 
a factorial arrangement (with some unbalance in sample size) . 

It is anticipated that the analysis of variance will be based 
on ($\log R_i$) data, the log transformation being used to 
stabilize variance over the cells, a condition required in 
analysis of variance. To see the appropriateness of this 
transformation, consider the type of distribution that is 
likely to be sampled through observing radial miss distances 
$R_1, R_2, \ldots, R_N$ within a cell of the design. We anticipate 
that $R_i^2 \sim k\,\chi^2_{(2)}$, where "$\sim$" means 
"approximately distributed as" and k is a constant proportional 
to $CEP^2$. Then $E(R_i^2) \approx 2k$ and $V(R_i^2) \approx 4k^2$, 
so the standard deviation in a given cell is approximately 
proportional to the mean, i.e., $\sigma = c\mu = h(\mu)$, where c 
is a constant and h is linear. Then the transformation g given by 

    $g(r^2) = \int \frac{1}{h(r^2)}\, d(r^2) = \ln r^2$ 

is commonly used to make $\sigma$ constant over varying values of 
$\mu$ (see Curtiss [2], for example). But $\ln r^2 \propto \ln r$, 
hence analysis of variance can be performed on $\log R_i$ data. 
Appropriateness of this transformation can be assessed once the 
experimentation data are available. 
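
The variance-stabilizing effect of the log transformation can be illustrated by simulation. The sketch below draws squared radial misses from the exponential distribution implied by the Rayleigh model, for three hypothetical cell CEP values chosen only for illustration, and compares the spread of the raw squared misses with the spread of $\log R$. Function names, seeds and sample sizes are assumptions of the sketch.

```python
import math
import random
import statistics

def simulate_squared_misses(cep, n, seed):
    """Draw n squared radial misses with median cep**2 (Rayleigh model)."""
    rng = random.Random(seed)
    mean_r2 = cep ** 2 / math.log(2.0)      # E(R^2) = C^2 / ln 2
    return [mean_r2 * rng.expovariate(1.0) for _ in range(n)]

def spread_summary(ceps, n=5000):
    """For each CEP, return (cep, sd of R^2, sd of log R)."""
    rows = []
    for i, cep in enumerate(ceps):
        r2 = [y for y in simulate_squared_misses(cep, n, seed=i) if y > 0.0]
        sd_raw = statistics.pstdev(r2)
        sd_log = statistics.pstdev([0.5 * math.log(y) for y in r2])  # log R = (1/2) log R^2
        rows.append((cep, sd_raw, sd_log))
    return rows

for cep, sd_raw, sd_log in spread_summary([1.0, 2.0, 4.0]):
    print(cep, round(sd_raw, 2), round(sd_log, 3))
```

Under the model, the standard deviation of $R^2$ grows with the square of CEP, while the standard deviation of $\log R$ stays near $\pi/(2\sqrt{6}) \approx 0.64$ in every cell. A similar check on the actual drop data would be one way to assess the transformation.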

If the speed and altitude levels actually attained in 
the trials vary substantially (say more than 10%) from the levels 
specified in the design matrix, one or both of these factors 
may be incorporated as covariates in an Analysis of Covariance 
(AOC) , rather than the analysis of variance described above. 
Again, determination of whether this is necessary or desirable 
can be made once the experimentation data are available. For 
this purpose, the data arising from each drop should be in a 
format which includes measured values of speed and altitude. 

In addition to the AOV or AOC, secondary analysis may 
include fitting a response surface to the observed drop data. 
This could be done using regression (perhaps weighted to 
accommodate inhomogeneity of variance) to estimate a surface 
giving system accuracy as a function of the variables altitude, 
range and possibly speed, for the system in the Auto mode. 
Terms in the model should be selected so known and anticipated 
physical system characteristics and target/range characteristics 
are likely to be adequately represented. Although the dependent 
variable could be taken to be sample CEP in each cell, a better 
model might result from modeling squared radial miss distance 
via the regression, then transforming predictions with this 
model to CEP predictions, if desired, using the Rayleigh-based 
relationship. 
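
For reference, the Rayleigh-based relationship follows from the fact stated earlier in this section that $R^2$ has mean $C^2/\ln 2$ when CEP = C, so a predicted mean squared miss converts to a CEP prediction as $\sqrt{\ln 2 \cdot E(R^2)}$. A minimal sketch (function names are illustrative, and the conversion is only as good as the Rayleigh assumption):

```python
import math

def cep_from_mean_square(mean_r2):
    """CEP implied by a (predicted) mean squared radial miss, Rayleigh model."""
    return math.sqrt(mean_r2 * math.log(2.0))

def cep_estimate(radial_misses):
    """Cell CEP estimate from observed radial misses via the same relationship."""
    mean_r2 = sum(r * r for r in radial_misses) / len(radial_misses)
    return cep_from_mean_square(mean_r2)

print(round(cep_from_mean_square(1.0 / math.log(2.0)), 6))
```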

Finally, additional analyses (such as pairwise compari- 
sons, cell CEP estimates, patterns of trial "aborts" and 
"outlier" rejections, etc.) and presentations of summary data 
should be undertaken. The precise nature of these analyses 
has not been explored as yet, and to a large extent will depend 
on the data obtained. Close coordination with test personnel 
should also be maintained by the analyst, in order to assist 
in determining what additional analyses would be appropriate. 

It is planned to use the ARIS system to assist in 
determining causes for observed large misses. This procedure 
constitutes an "outlier" rejection rule, which could bias the 
experiment, as follows. If only large miss drops are subjected 
to the ARIS screening, the overall effect will be to possibly 
eliminate some of the large misses, which in turn makes the 
remaining drops appear more accurate. Such screening may be 
appropriate; however, we suggest two actions which may assist 
in determining whether biasing has occurred. First, records 
of any such eliminated drops should be kept, for possible 
subsequent analysis. Second, the ARIS screen should be applied 
formally to a sample of "good" drops, using the same rejection 
criteria as for the outlier cases. Records should be kept 
of the results of such screening of "good" drops. These can 
be used to help assess the degree of bias that may have been 
induced by elimination of drops with large miss distances 
that were not actually outliers. 

With the large number of individual tests being 
performed with the primary analysis (i.e., one CEP test in 
each cell of the design) , it is likely that there will be a 
mixture of rejections and acceptances of the contract specified 
CEP's. There will occur, therefore, the problem of making an 
overall assessment of whether the system is sufficiently 
accurate. It would be a good idea to explore this problem with 
the decision maker, and to indicate how changing the Type I 
and Type II error rates ($\alpha$ and $\beta$, respectively) affects the 
accept/reject patterns that may be encountered. Perhaps the 
significance of the observed number of rejections can be 
assessed in terms of physical explanation of system patterns, 
as well as the conditions anticipated in actual operational 
use of the system. The binomial distribution may be of some 
use in determining whether the number of rejections is 
significant, or perhaps Fisher's method [4] of combining experi- 
mental results can be used. 
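
The two suggestions above might be sketched as follows. The first function gives the probability of seeing at least r rejections among m cells if every cell just meets specification and each test independently rejects with probability $\alpha$. The second combines per-cell significance levels by Fisher's method, which refers $-2 \sum \ln p_i$ to a chi-squared distribution with 2m degrees of freedom. Both assume independence across cells, and Fisher's method presumes a significance level is available for each cell (as with the fixed sample size tests); the function names and the choice m = 18 in the usage line are illustrative.

```python
import math

def rejection_count_tail(r, m, alpha):
    """P[X >= r] for X ~ b(m, alpha): chance of r or more rejections in m cells."""
    return sum(math.comb(m, j) * alpha ** j * (1.0 - alpha) ** (m - j)
               for j in range(r, m + 1))

def fisher_combined_p(p_values):
    """Overall significance level from Fisher's method of combining tests."""
    m = len(p_values)
    half = -sum(math.log(p) for p in p_values)   # half of the chi-squared statistic
    term, total = 1.0, 1.0
    for j in range(1, m):
        term *= half / j
        total += term
    return math.exp(-half) * total               # upper tail of chi-squared, 2m df

print(round(rejection_count_tail(4, 18, 0.10), 3))
print(round(fisher_combined_p([0.10, 0.10]), 4))
```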

It should be borne in mind that theoretically the 
secondary analyses may be affected by the stopping rule used 
in the primary tests. If sequential tests are used for the 
primary analysis, the data in each cell are, in a mild sense, 
conditional, given the data obtained led to acceptance or 
rejection, as the case may be. It is not anticipated that 
this simultaneous inference effect will be great enough to 
cause difficulties from a practical point of view, however. 

5. A SAMPLE SIZE REDUCTION METHOD 

We have argued elsewhere [1] that the major shortcoming 
of the Rayleigh model for unguided weapon misses is that in 
some applications it fails to adequately fit the upper tail 
of the miss distance distribution. Even in such cases, however, 
the model may provide useful results for the major portion of 
the miss distribution short of the very large misses. In what 
follows we describe such an application of the Rayleigh distri- 
bution to reduce sample size required in the primary analyses 
concerning CEP testing. This approach is applicable to both 
the SNT and FNT. Throughout, we assume the Rayleigh model 
provides reasonable fit to the radial miss distribution except 
possibly for the upper tail region (which we define here as 
the set of points larger than the upper 95% point in the 
Rayleigh distribution) . 

Suppose, then, under fixed conditions the squared 
radial miss cumulative distribution function is 

    $F_{R^2}(y) = 1 - \exp(-y \ln 2 / C^2), \quad y > 0$, 

where $C^2$ is the median of $R^2$ (i.e., $C^2$ is the square of 
the system CEP). Let $C_0^2$ denote the squared CEP under the 
null hypothesis $H_0$: CEP = $C_0$ and assume the alternative 
hypothesis is $H_a$: CEP = $C_1 = kC_0$. For convenience in 
notation, assume miss distances are measured in $C_0$-units, so 
$C_0 = 1$, and k represents the $CEP_1/CEP_0$ ratio. Recall both 
the SNT and FNT are based on the binomial distribution of the 
number of hits inside a circle of radius 1 (= $C_0$). Under the 
null hypothesis the probability of hitting this circle is 

    $F_{R^2}(1) = 1 - \exp(-\ln 2 / 1^2) = 0.5$ 

and under $H_a$ the probability of such a hit is 

    $F_{R^2}(1) = 1 - \exp(-\ln 2 / k^2)$. 

For example, with k = 2 this probability is 
$1 - \exp(-\ln 2 / 4) \approx .1591$. 

The basic idea we wish to discuss is that of allowing 
the definition of "hit" to be associated with circles of radii 
possibly different from $C_0$. We shall show that even though 
we maintain the null and alternate hypotheses about CEP 
described above, the binomial data to test these hypotheses 
can be made far more efficient by defining the hit/miss 
criterion differently. Let $P_0(C)$ denote the probability 
under $H_0$ of observing a miss distance within C units of 
the target, and similarly let $P_1(C)$ denote that probability 
under $H_a$. Then 

    $P_0(C) = P[R^2 \le C^2 \mid CEP = 1] = 1 - \exp(-C^2 \ln 2) = 1 - 2^{-C^2}$ 

and 

    $P_1(C) = P[R^2 \le C^2 \mid CEP = k] = 1 - \exp(-C^2 \ln 2 / k^2) = 1 - 2^{-C^2/k^2}$. 

We wish to determine C so as to minimize the sample size N 
(or in the sequential case, expected sample size) required to 
achieve a test of $H_0$ vs $H_a$ with preselected operating 
characteristics $\alpha$ and $\beta$. Our procedure is to express N 
as a function of C, then minimize. For ease of presentation 
we use the arcsine transformation of binomial random variables 
to normality [2], and limit ourselves to the fixed sample 
size case (although neither of these conveniences is necessary). 

With some radius C of the hit circle definition, 
the test of $H_0$ vs $H_a$ would be based on X, the observed 
relative frequency of hits. The null hypothesis is rejected 
for X sufficiently small. For any selected value of C, 
let p(C) denote the corresponding probability an individual 
bomb results in a hit. For even moderate values of N, 

    $Y = 2 \sin^{-1}\sqrt{X}$ is approximately normal with mean $2 \sin^{-1}\sqrt{p}$ and variance $1/N$, 

although the approximation may be quite rough if p is 
"extreme" (outside the interval (.05, .95) or so). Note: the 
angle $2 \sin^{-1}\sqrt{X}$ is measured in radians. Now, in terms of 
the test statistic Y, because $2 \sin^{-1}\sqrt{\cdot}$ is monotone 
increasing, $H_0$ should be rejected if $Y \le d$, where the 
critical value d and sample size N are selected so that 
the desired size and power are attained: 

    $P[Y \le d \mid CEP = C_0] = \alpha$, 

    $P[Y \le d \mid CEP = C_1] = 1 - \beta$. 

Using the arcsine transformation described above, these conditions 
are met (at least to good approximation) provided 

    $d - 2 \sin^{-1}\sqrt{p_0} = z_{\alpha}/\sqrt{N}$, 

    $d - 2 \sin^{-1}\sqrt{p_1} = z_{1-\beta}/\sqrt{N}$, 

where $z_{\delta}$ is the $\delta$th quantile of the standard normal 
distribution. Thus in order to minimize N subject to meeting 
the $\alpha$ and $\beta$ requirements it suffices to maximize 

    $f(p_0) = \sin^{-1}\sqrt{p_0} - \sin^{-1}\sqrt{p_1} = \sin^{-1}\sqrt{p_0} - \sin^{-1}\sqrt{1 - (1 - p_0)^{1/k^2}}$. 

This is easily done for various fixed values of the $CEP_1/CEP_0$ 
ratio k. Values of $f(p_0)$ can be used to estimate FNT 
sample size requirements through the approximation 

    $N \approx \left[ \frac{z_{1-\alpha} + z_{1-\beta}}{2 f(p_0)} \right]^2$. 


As an example to demonstrate this idea, suppose 
$\alpha = \beta = 0.10$, k = 2. Then values of $f(p_0)$, the radius C 
of the "hit" circle, and approximate sample sizes for the FNT 
are as shown in Table 3. 

The maximum of $f(p_0)$ occurs at $p_0 = .94$ and this 
theoretically minimizes N. Note, however, that $p_0 = .90$ 
yields the same savings in sample size and has the advantage 
of not involving the model so far into the upper tail as does 
the sample size minimizing value, .94. Note the sample size 
requirement with $p_0 = .90$ is substantially below that for the 
C = $CEP_0$ defined "hit" circle described in Section 4 in 
connection with the SNT and FNT, where $p_0 = .50$. The relative 
reduction in approximate sample size requirements for the 
example discussed above is shown in Table 3. 

As mentioned above, this sample size reduction scheme 
can be used for both the SNT and FNT, although only the FNT 
case was illustrated by example. The approximation involved 
through the arcsine transformation is an inessential part of 
this development; it was used here for simplicity of demon- 
strating the potential of this approach. 
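
The calculation behind this example can be sketched in Python using only the standard library. The code evaluates $f(p_0)$ and the approximate FNT sample size over a grid of $p_0$ values and reports the grid point maximizing $f$. The function names and the grid spacing of .01 are assumptions of the sketch, and the results inherit the roughness of the arcsine approximation for extreme $p_0$.

```python
import math
from statistics import NormalDist

def alt_hit_prob(p0, k):
    """Hit probability under Ha for the circle whose H0 hit probability is p0."""
    return 1.0 - (1.0 - p0) ** (1.0 / k ** 2)

def hit_radius(p0):
    """Radius C of the hit circle, in C0-units, from p0 = 1 - 2**(-C**2)."""
    return math.sqrt(-math.log(1.0 - p0) / math.log(2.0))

def f(p0, k):
    """Separation of the hypotheses on the arcsine scale."""
    return math.asin(math.sqrt(p0)) - math.asin(math.sqrt(alt_hit_prob(p0, k)))

def approx_n(p0, k, alpha, beta):
    """Approximate fixed sample size for the FNT with hit probability p0 under H0."""
    z = NormalDist().inv_cdf(1.0 - alpha) + NormalDist().inv_cdf(1.0 - beta)
    return (z / (2.0 * f(p0, k))) ** 2

grid = [i / 100.0 for i in range(50, 100)]
best_p0 = max(grid, key=lambda p: f(p, 2.0))
for p0 in (0.50, 0.90, best_p0):
    print(p0, round(hit_radius(p0), 3), round(approx_n(p0, 2.0, 0.10, 0.10), 2))
```

For k = 2 and $\alpha = \beta = 0.10$ the grid maximum falls at $p_0 = .94$, and the approximate sample size at $p_0 = .50$ rounds up to the N = 12 used in the FNT example of Section 4.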

TABLE 3. Column heads: "hit" circle radius; "hit" probability under $H_0$; approximate sample size; % savings in sample size. 

6. RECOMMENDATION 



Based on the information available, the following 
approach to the PSVT is recommended: use the SNT for primary 
CEP testing, possibly with reduction in E(N) using the 
method described in Section 5. However, if this reduction 
scheme is adopted, the definition of the "hit" circle should 
not be allowed to involve $p_0$ values too extreme (i.e., 
$C^2$ values too far in the upper tail of the Rayleigh 
distribution). Probably a reasonable upper bound for $p_0$ is .90. 

The tests should be conducted so as to deliver individual 
drops in each cell of the design on different days, to 
the extent possible. Drops should be made in each cell so 
that uncontrolled variables (such as day, time of day, heading, 
pilot, aircraft, weather, etc.) vary over as wide a span as 
practicable. As pointed out in the preceding, this approach 
yields the following advantages: (1) it gives observations 
which are more nearly independent; (2) it provides estimates 
of CEP which are more realistic; (3) it avoids the increase 
in sample size with batch testing; and (4) it may give more 
time to measure miss distances. 

Secondary analyses of the radial miss data, including 
(but not limited to) analysis of variance, analysis of co- 
variance, and multiple regression should be performed. 
Transformations to stabilize variance and weighted regression 
should be used if the data suggest there is lack of homogeneity 
of variance. 